Abstract

In recent years, considerable advances have been made in the study of properties of metric
spaces in terms of their doubling dimension. This line of research has not only enhanced our
understanding of finite metrics, but has also resulted in many algorithmic applications. However,
we still do not understand the interaction between various graph-theoretic (topological)
properties of graphs, and the doubling (geometric) properties of the shortest-path metrics induced
by them. For instance, the following natural question suggests itself: given a finite doubling
metric $(V, d)$, is there always an unweighted graph $(V', E')$ with $V \subseteq V'$ such that the
shortest-path metric $d'$ on $V'$ is still doubling and agrees with $d$ on $V$? Such a representation
is useful, since unweighted graphs are often easier to reason about.

A first hurdle to answering this question is that subdividing edges can increase the doubling
dimension unboundedly, and it is not difficult to show that the answer to the above question
is negative. Surprisingly, however, allowing a $(1 + \varepsilon)$ distortion between $d$ and $d'$ enables us
to bypass this impossibility: we show that for any metric space $(V, d)$, there is an unweighted graph
$(V', E')$ with shortest-path metric $d' : V' \times V' \to \mathbb{R}_{\ge 0}$ such that

• for all $x, y \in V$, the distances satisfy $d(x, y) \le d'(x, y) \le (1 + \varepsilon) \cdot d(x, y)$, and

• the doubling dimension of $d'$ is not much more than that of $d$, where the increase depends
only on $\varepsilon$ and not on the size of the graph.

We show a similar result when both (V, d) and (V, E') are restricted to be trees: this gives a 
simple proof that doubling trees embed into constant dimensional Euclidean space with constant 
distortion. We also show that our results are tight in terms of the tradeoff between distortion 
and dimension blowup. 
1 Introduction



The algorithmic study of finite metrics has become a central theme in theoretical computer science 
in recent years. Of particular interest has been the study of the geometry of metrics — embeddings 
into Minkowski spaces have been the most obvious example, accompanied by the study of notions 
of metric dimension which have allowed us to partially quantify geometric properties that make 
metrics tractable for several algorithmic problems. 

Given these advances in our understanding of the geometric properties of abstract metric spaces, 
it is worth remarking that our comprehension of the topological properties of metric spaces, and
of the relationship between topology and geometry, has lagged behind: we do not yet have a good
comprehension of how the structure of a graph interacts with the dimensionality of the shortest-path 
metric induced by it. One such example shows up in a paper [8], where a fairly simple algorithm 
is given for low-distortion Euclidean embeddings of unweighted trees whose shortest-path metric is 
doubling — however, extending the result to embed weighted trees (also with doubling shortest-path 
metrics) requires significantly more work. This raises the natural question: given a doubling tree 
metric M = (V,d), is there an unweighted tree G = (V',E') whose shortest-path metric is also 
doubling, and contains M as a submetric? In fact, the situation is even more embarrassing: we do 
not know the answer even if we drop the requirement that G be a tree, and look for any unweighted 
graph! 

An immediate obstacle to answering these questions is the observation that subdividing the edges of
a weighted tree to convert it into an unweighted tree can increase the dimension unboundedly. For
example, take a star $K_{1,n}$, and set the length of the $i$-th edge $\{v_0, v_i\}$ to be $2^i$. It is easy to check
that the metric $d_G$ has constant doubling dimension; however, subdividing the $i$-th edge into $2^i$ parts
to make it unit-weighted creates a new graph in which the $n$ neighbours of $v_0$ are at distance 2 from
each other, and which therefore has doubling dimension $\Omega(\log n)$, which is unbounded. On the positive
side, it is easy to show that this metric can be embedded into the real line with distortion 2 (e.g., by
the map $v_i \mapsto 2^i$), which we can
subdivide without altering the doubling dimension. In this paper, we show that this positive result
is not an aberration: any tree metric can be represented as a submetric of an unweighted tree metric 
which has almost the same doubling dimension. We show a similar result for arbitrary graphs as 
well, and show that our tradeoff between distortion and the dimension blowup is asymptotically 
optimal. 

Formal Definitions: To define the problems we study, let us define the convex closure of a graph,
which is an extension of the notion of subdividing edges. Given a graph $G = (V, E)$ with edge
lengths $\ell : E \to \mathbb{R}_{>0}$, assume that the names of the vertices in $V$ belong to some total order $(V, \prec)$.
Let $V_C$ be the uncountably infinite set of points $V \cup \{e[x] \mid e \in E,\ x \in (0, \ell(e))\}$ obtained by
considering each edge as a continuous segment of length $\ell(e)$; for $e = \{u, v\}$ with $u \prec v$, the point
$e[x]$ lies at distance $x$ from $u$ along $e$. Let $M_G = (V, d_G)$ be the shortest-path metric of the graph $G$:
we can define a natural metric on the set $V_C$ as

$$d_C(e[x], e'[y]) = \min\{\, x + d_G(u, u') + y,\;\; x + d_G(u, v') + (\ell(e') - y),$$
$$\qquad\qquad (\ell(e) - x) + d_G(v, u') + y,\;\; (\ell(e) - x) + d_G(v, v') + (\ell(e') - y) \,\},$$

if $e = \{u, v\}$ (with $u \prec v$) and $e' = \{u', v'\}$ (with $u' \prec v'$). We now define the convex closure
of the graph $G$ to be the metric space $\mathrm{conv}(G) = M_C = (V_C, d_C)$. Note that the metric obtained by
subdividing edges of $G$ is a sub-metric of the convex closure of $G$, and hence it suffices to study
the doubling dimension of this convex closure $\mathrm{conv}(G)$.
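The four-term minimum above can be evaluated directly once the vertex-to-vertex distances are known. The following is a minimal sketch, not part of the paper: the names `shortest_paths` and `conv_dist` are hypothetical, vertices are assumed to be the integers 0..n-1 with their natural order playing the role of the total order on vertex names, and the extra same-edge term is an assumption added for points lying on a common segment.

```python
from itertools import product


def shortest_paths(n, edges):
    """All-pairs shortest paths (Floyd-Warshall) on vertices 0..n-1.

    `edges` maps a pair (u, v) with u < v to a positive edge length.
    """
    inf = float("inf")
    d = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for (u, v), length in edges.items():
        d[u][v] = d[v][u] = min(d[u][v], length)
    # product(...) varies its first coordinate slowest, so k is the outer loop.
    for k, i, j in product(range(n), repeat=3):
        d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d


def conv_dist(d, edges, p, q):
    """Distance in conv(G) between p = (e, x) and q = (e2, y).

    A point (e, x) with e = (u, v), u < v, sits at distance x from u on e.
    """
    (u, v), x = p
    (u2, v2), y = q
    le, le2 = edges[(u, v)], edges[(u2, v2)]
    best = min(
        x + d[u][u2] + y,
        x + d[u][v2] + (le2 - y),
        (le - x) + d[v][u2] + y,
        (le - x) + d[v][v2] + (le2 - y),
    )
    if (u, v) == (u2, v2):
        # Assumption: two points on the same segment can also walk along it.
        best = min(best, abs(x - y))
    return best


# Star with centre 0 and edge lengths 2, 4, 8.
star = {(0, 1): 2, (0, 2): 4, (0, 3): 8}
dist = shortest_paths(4, star)
near_centre = conv_dist(dist, star, ((0, 1), 1), ((0, 2), 1))
```

Here `near_centre` is the distance between the two points at distance 1 from the centre on the first two edges of the star; points of this kind are the ones that crowd together when the exponentially weighted star is subdivided.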
1.1 Our Results



The example of $K_{1,n}$ with exponential edge weights shows that even if the shortest-path metric
$M_G$ of a graph $G$ is doubling, its convex closure $M_C$ may not be doubling. The goal of this paper
is to show that despite this, there is a "close-by" graph $G'$ whose convex closure $\mathrm{conv}(G')$ is indeed
doubling. In particular, the main theorem is the following:

Theorem 1 (Main Theorem) Given a graph $G = (V, E)$ with specified edge-lengths, we can
efficiently find a graph $G' = (V, E')$ (also with non-negative edge-lengths) such that

• The distances in $G$ and $G'$ are within a multiplicative factor of $(1 + \varepsilon)$ of each other, and

• If $\dim(M_G) = k$, then $\dim(M_{G'}) = O(k)$, and $\dim(\mathrm{conv}(G')) = O(k \log \varepsilon^{-1})$.

Since Theorem 1 does not give any guarantees about the topology of the graph $G'$, we prove an
analogous result about tree metrics, with improved guarantees on the dimension:

Theorem 2 Given a tree $T = (V, E)$ with specified edge-lengths, we can efficiently find a tree
$T' = (V', E')$ with $V \subseteq V'$ (and with non-negative edge-lengths) such that

• For $x, y \in V$, the distances between them in $T$ and $T'$ are within a multiplicative factor of
$(1 + \varepsilon)$ of each other, and

• If $\dim(M_T) = k$, then $\dim(M_{T'}) = O(k)$, and $\dim(\mathrm{conv}(T')) = O(k + \log \log \varepsilon^{-1})$.

As a corollary of this result, we obtain an independent proof of the following result about
embeddings of doubling tree metrics into $\ell_p$ spaces:

Corollary 3 ([8]) Every (weighted) doubling tree metric embeds into $\ell_p$ with constant distortion
and constant dimension.

(Another proof of this embedding result for doubling trees appears in [20], using completely different 
techniques.) 

In addition, we show that the tradeoff between the distortion and the dimension of the convex
closure shown in Theorem 2 is asymptotically optimal:

Theorem 4 There exists a tree metric $T = (V, E)$ with $\dim(M_T) = O(1)$ such that for any tree
metric $T' = (V', E')$ with $V \subseteq V'$, the following holds. If $d_T(u, v) \le d_{T'}(u, v) \le (1 + \varepsilon)\, d_T(u, v)$ for
all $u, v \in V$, then $\dim(\mathrm{conv}(T'))$ must be $\Omega(\log \log \varepsilon^{-1})$.

For general graphs, we show that our tradeoff is asymptotically optimal, under the restriction that
the graph $G'$ is defined on the same vertex set as $G$, i.e. we do not use any Steiner points.

Theorem 5 There exists a metric $G = (V, E)$ with $\dim(M_G) = O(1)$ such that for any graph
$G' = (V, E')$, the following holds. If $d_G(u, v) \le d_{G'}(u, v) \le (1 + \varepsilon)\, d_G(u, v)$ for all $u, v \in V$, then
$\dim(\mathrm{conv}(G'))$ must be $\Omega(\log \varepsilon^{-1})$.

Bibliographic Note. James Lee informs us that a weaker form of Theorem 1 can be inferred
from results in a paper of Semmes [21]. Indeed, the techniques of that paper imply that for every 
doubling metric G, there is a graph G' whose convex closure is also doubling; however, the resulting 
distortion between distances in G and G' using this approach seems to depend on dim(G), and it 
is not clear how to reduce this distortion to (1 + e) as in the results above. 
1.2 Related Work



The notion of doubling dimension was introduced by Assouad [1] and first used in algorithm design
by Clarkson [5]. The properties of doubling metrics and their algorithmic applications have since 
been studied extensively, a few examples of which appear in [8j [T71 (THJ [251 QUI El dU 13 [16] . 

Somewhat similar in spirit to our work is the 0-extension problem [13, 3, 7]. Given a graph G, the
0-extension (cf. Lipschitz extendability [12, 22, 19]) problem deals with extending a (Euclidean)
embedding of the vertices of the graph to an embedding of the convex closure of the graph, while 
approximately preserving the Lipschitz constant of the embedding. Our results can be interpreted 
as analogues to the above where the goal is to approximately preserve the doubling dimension. 

A number of papers have dealt with geometric implications of topological properties of the graph 
inducing the metric, e.g. when the graph is planar [3 [23], outer-planar [9], series-parallel [SJ, or a 
tree [21]. 

2 Preliminaries and Notation 

Given a graph $G$, the shortest-path metric on it is denoted by $d_G$, and we shall use $B_G(x, r)$ to denote
the "ball" $\{y \in V_G : d_G(x, y) \le r\}$. We will often omit the subscript $G$ when it is obvious from
context. There are several ways of defining the doubling constant $\lambda$ and the doubling dimension
dim for a metric space, all of them within a constant factor of each other: here is the one that will
be most useful for us.

Definition 6 (Doubling Constant and Doubling Dimension) A metric space $(X, d)$ has doubling
constant $\lambda$ if for each $x \in X$ and $r > 0$, given the ball $B(x, 2r)$, there is a set $S \subseteq X$ of size
at most $\lambda$ such that $B(x, 2r) \subseteq \bigcup_{y \in S} B(y, r)$. The doubling dimension is $\dim((X, d)) = \log_2 \lambda$.

Fact 7 (Subset Closed) Let metric $M = (X, d)$ have doubling dimension $k$. If $X' \subseteq X$, and
$d' = d|_{X' \times X'}$, then $(X', d')$ has doubling dimension at most $k$.

Fact 8 (Small Uniform Metrics) If a metric $M = (V, d)$ has doubling dimension $k$ then there
exists a point $x$ and a radius $r$ such that the ball $B(x, r)$ contains at least $2^k$ points with interpoint
distances at least $r/2$.

Given a metric $(X, d)$, an $r$-packing is a subset $P \subseteq X$ such that any two points in $P$ are at least
distance $r$ from each other. An $r$-covering is a subset $C \subseteq X$ such that for each point $x \in X$, there
is a point $c \in C$ at distance $d(x, c) \le r$. An $r$-net is a subset $N \subseteq X$ that is both an $r$-packing and
an $r$-covering.
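For a finite metric, an r-net in the sense just defined can be obtained by a single greedy scan. The sketch below is illustrative only; the helper names `greedy_net`, `is_packing` and `is_covering` are hypothetical, and the metric is assumed to be supplied as a Python function `dist(a, b)`.

```python
def greedy_net(points, dist, r):
    """Scan the points once, keeping each one that is at least r from all kept so far."""
    net = []
    for p in points:
        if all(dist(p, c) >= r for c in net):
            net.append(p)
    return net


def is_packing(subset, dist, r):
    """Any two distinct points of `subset` are at distance at least r."""
    return all(dist(a, b) >= r for i, a in enumerate(subset) for b in subset[i + 1:])


def is_covering(subset, points, dist, r):
    """Every point has some point of `subset` within distance r."""
    return all(any(dist(p, c) <= r for c in subset) for p in points)


# Ten equally spaced points on a line, with r = 3.
line = list(range(10))
gap = lambda a, b: abs(a - b)
net = greedy_net(line, gap, 3)
```

A point is rejected only when some kept point lies within distance less than r of it, so the kept set is automatically an r-covering, and it is an r-packing by construction.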

Fact 9 ("Small" Nets) Let metric $M = (V, d)$ have doubling dimension $k$, and let $N$ be an $r$-net of
$M$. Then for any $x \in V$ and radius $R$, the set $B(x, R) \cap N$ has size at most $(4R/r)^k$.

Definition 10 (Geodesic Metrics) A metric $(X, d)$ is said to be geodesic if for every $u, v \in X$,
$u \ne v$, there is a continuous map $f_{uv} : [0, d(u, v)] \to X$ such that $f_{uv}(0) = u$, $f_{uv}(d(u, v)) = v$ and
$d(f_{uv}(x), f_{uv}(y)) = |x - y|$ for any $x, y \in [0, d(u, v)]$.

Fact 11 For any graph G = (V,E), the metric conv(G) is a geodesic metric space. 
3 A Structure Theorem



In this section, we show how to characterize the dimension of the convex closure of a graph H in 
terms of some easier-to-handle parameters of the graph. 

Definition 12 (Long Edges) Given a graph $H = (V, E)$, a vertex $u \in V$ and a radius $r > 0$, call
an edge $e = \{v, w\}$ a long edge with respect to $u, r$ if one endpoint of $e$ is at distance at most $r$
from $u$, and $\ell(e) > r$.

Let the set of long edges with respect to $u, r$ be denoted by $L_u(r)$. The following structure theorem
gives us a characterization of the doubling dimension of $\mathrm{conv}(H)$ in terms of the number of long
edges.
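Counting long edges is a direct computation from Definition 12. As a small illustration (a sketch under stated assumptions, not the paper's code), the hypothetical helpers below build the star with exponential edge lengths used as the running example and count the long edges seen from a given vertex at a few radii.

```python
def star_metric(n):
    """Star K_{1,n}: centre 0, leaf i joined by an edge of length 2**i."""
    edges = {(0, i): 2 ** i for i in range(1, n + 1)}
    dist = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[0][i] = dist[i][0] = 2 ** i
        for j in range(1, n + 1):
            if i != j:
                dist[i][j] = 2 ** i + 2 ** j
    return edges, dist


def long_edges(edges, dist, u, r):
    """Edges longer than r with at least one endpoint within distance r of u."""
    return [
        (v, w)
        for (v, w), length in edges.items()
        if length > r and min(dist[u][v], dist[u][w]) <= r
    ]


edges, dist = star_metric(5)
counts = {r: len(long_edges(edges, dist, 0, r)) for r in (1, 4, 16)}
```

At the centre with radius 1 every edge of the star is long, so the count grows with the number of leaves even though the vertex metric itself is doubling.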

Theorem 13 (Structure Theorem) There exist constants $c_1$ and $c_2$ such that the following
holds. Consider any graph $H = (V, E)$, and any $k \ge \dim_H$: if the number of long edges $|L_u(r)| \le 2^k$
for every $u \in V$ and every $r > 0$, then the doubling dimension of the convex closure $\mathrm{conv}(H)$ is at
most $c_1 k$. Moreover, if the doubling dimension of the convex closure $\mathrm{conv}(H)$ is at most $k$, then
for every vertex $u \in V$ and every radius $r > 0$, the number of long edges $|L_u(r)| \le 2^{c_2 k}$.

Proof: Suppose the number of long edges $|L_u(r)|$ is at most $2^k$ for every $u, r$. We show that
for any $u \in \mathrm{conv}(H)$ and any $r > 0$, the ball $B_{\mathrm{conv}(H)}(u, 2r)$ can be covered by at most $2^{O(k)}$ balls
$B_{\mathrm{conv}(H)}(y, \tfrac{3}{2} r)$. Repeating this argument three times (since $(\tfrac{3}{4})^3 < \tfrac{1}{2}$), this suffices to prove that
the doubling dimension of $\mathrm{conv}(H)$ is $O(k)$.

First, consider $u \in V(H)$. From the definition of doubling dimension, there is a set $S \subseteq V_H$ of
size at most $2^{2 \dim_H} \le 2^{2k}$ such that $V_H \cap B(u, 2r) \subseteq \bigcup_{y \in S} B(y, \tfrac{r}{2})$. Let
$S' = S \cup \{e[r] \mid e \in L_u(r)\} \cup \{e[\ell(e) - r] \mid e \in L_u(r)\}$. Clearly, $|S'| \le 2^{O(k)}$. We shall show that
$B(u, 2r)$ is contained in $\bigcup_{y \in S'} B(y, \tfrac{3}{2} r)$.

Let $e[x] \in B(u, 2r)$, where $e = \{v, w\}$ such that $d(u, e[x]) = d(u, v) + x'$ where $x' \in \{x, \ell(e) - x\}$.
Assume $v \prec w$ (the other case is similar) so that $x' = x$. If $x \le r$, then consider $y \in S$ such that
$v \in B(y, \tfrac{r}{2})$. Clearly $d(y, e[x]) \le d(y, v) + x \le \tfrac{3}{2} r$. Hence $e[x] \in B(y, \tfrac{3}{2} r)$. On the other hand, if
$x > r$ and $e[x] \in B(u, 2r)$, then $d(u, v) = d(u, e[x]) - x < r$ so that the edge $e$ is long with respect
to $(u, r)$. Thus $e[r] \in S'$ and since $r < x \le 2r$, we conclude that $e[x] \in B(e[r], r)$. Thus for any
$u \in V_H$, $B(u, 2r) \subseteq \bigcup_{y \in S'} B(y, \tfrac{3}{2} r)$.

Finally, we have to consider balls around points in $\mathrm{conv}(H) \setminus V(H)$: note that for $e = \{u, v\}$,
$B(e[x], 2r) \subseteq B(u, 2r) \cup B(v, 2r) \cup \{e[z] \mid \max(0, x - 2r) \le z \le \min(\ell(e), x + 2r)\}$. By the argument
above, the first two can be covered by $2^{O(k)}$ balls of radius $\tfrac{3}{2} r$ each. The subset of $e$ in $B(e[x], 2r)$
is one-dimensional and thus can be covered by two balls of radius $r$ each. This completes the
argument showing that if the number of long edges is small, the convex completion has a small
doubling dimension.

For the converse, we shall show that $\dim(\mathrm{conv}(H)) \ge \Omega(\log \max_{u, r} |L_u(r)|)$. Indeed consider the set of
points $W = \{e[\tfrac{r}{2}] \mid e \in L_u(r)\}$. It is easy to see that $W \subseteq B(u, 2r)$ but the balls $\{B(w, \tfrac{r}{2}) \mid w \in W\}$
are all disjoint. Thus $\dim(\mathrm{conv}(H)) \ge \tfrac{1}{2} \log |W| = \tfrac{1}{2} \log |L_u(r)|$, whence the claim follows. ■

The following simple result follows immediately. 

Corollary 14 For any n point metric (X,d), there exists a geodesic metric (X',d') that contains 
an isometric copy of (X,d) and has doubling dimension at most O(logn). 
The example of $K_{1,n}$ with exponential edge weights shows that this bound is tight, even when
$(X, d)$ itself has constant doubling dimension. In the following sections, we show that when $(X, d)$
indeed has small doubling dimension, the $O(\log n)$ bound above can be improved considerably if
one allows a small distortion.

4 Convex Completions for Graphs: Proof of Main Theorem 

In this section, we show how to take a graph $G = (V, E)$ and obtain a graph $G' = (V, E')$ on the
same vertex set, which has (almost) the same distances as in $G$, but whose doubling dimension
does not change by much under taking the convex closure. In particular, we use a bounded-degree
spanner construction due to Chan et al. [4]: they give an algorithm that, given a metric $(V, d)$ with
dimension $\dim_G = \dim(G)$ and a parameter $\varepsilon < 1/4$, outputs a spanner $G' = (V, E')$ such that
$d(x, y) \le d_{G'}(x, y) \le (1 + \varepsilon)\, d(x, y)$ for all pairs $x, y \in V$, and moreover the degree of each vertex
$x \in V$ is bounded by $\varepsilon^{-O(\dim_G)}$. We show that the convex closure of this spanner has doubling
dimension $O(\dim_G \log \varepsilon^{-1})$.

4.1 The Spanner Construction 

We start with a graph $G$ and carry out a series of transformations to obtain the graph $G'$. Let $\varepsilon < 1/4$
be given and let $\tau = 6 + \lceil \log(1/\varepsilon) \rceil$. Without loss of generality, the smallest pairwise distance in $G$
is at least $2^\tau$. We start with some more definitions.

Definition 15 (Hierarchical Tree) A hierarchical tree for a set $V$ is a pair $(T, \varphi)$, where $T$ is a
rooted tree, and $\varphi$ is a labeling function $\varphi : T \to V$ that labels each node of $T$ with an element in
$V$, such that the following conditions hold.

1. Every leaf is at the same depth from the root.

2. The function $\varphi$ restricted to the leaves of $T$ is a bijection into $V$.

3. If $u$ is an internal node of $T$, then there exists a child $v$ of $u$ such that $\varphi(v) = \varphi(u)$. This
implies that the nodes mapped by $\varphi$ to any $x \in V$ form a connected subtree of $T$.

Definition 16 (Net-Tree) A net tree for a metric $(V, d)$ is a hierarchical tree $(T, \varphi)$ for the set
$V$ such that the following conditions hold.

1. Let $N_i$ be the set of nodes of $T$ that have height $i$. (The leaves have height 0.) Let $r_0 = 1$, and
$r_{i+1} = 2 r_i$, for $i \ge 0$. (Hence, $r_i = 2^i$.) Then, for $i \ge 0$, $\varphi(N_{i+1})$ is an $r_{i+1}$-net for $\varphi(N_i)$.

2. Let node $u \in N_i$, and its parent node be $p_u$. Then, $d(\varphi(u), \varphi(p_u)) \le r_{i+1}$.

It is easy to see that net-trees exist for all metrics, and Har-Peled and Mendel show how to construct
a net-tree efficiently [10].
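The level sets of a net-tree can be pictured as a chain of nested nets. The sketch below is a naive quadratic-time illustration only and is not the efficient construction of [10]; the function names `nested_nets` and `top_level` are hypothetical, and the metric is assumed to be given as a Python function `dist(a, b)`.

```python
def nested_nets(points, dist, levels):
    """Return [S_0, ..., S_levels]: S_0 is all points and S_{i+1} is a greedy
    2**(i+1)-net of S_i, mirroring the label sets of the net-tree levels."""
    nets = [list(points)]
    for i in range(levels):
        r = 2 ** (i + 1)
        net = []
        for p in nets[-1]:
            if all(dist(p, c) >= r for c in net):
                net.append(p)
        nets.append(net)
    return nets


def top_level(nets, v):
    """The largest level i whose net still contains v."""
    return max(i for i, net in enumerate(nets) if v in net)


# Sixteen unit-spaced points on a line.
points = list(range(16))
nets = nested_nets(points, lambda a, b: abs(a - b), 4)
```

Each level is a subset of the previous one, and every point of level i has a point of level i+1 within distance less than 2**(i+1), which is what conditions 1 and 2 of Definition 16 ask of the labels.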

To construct their bounded-degree spanner, Chan et al. [4] define the following: suppose we are
given a graph $G = (V, E)$, whose shortest-path metric $M = (V, d_G)$ has doubling dimension $\dim_G$. Let
$\varepsilon > 0$ and $(T, \varphi)$ be any net tree for $M$. For each $i > 0$, let

$$E_i := \Big\{ \{u, v\} \;\Big|\; u, v \in \varphi(N_i),\ d_G(u, v) \le \Big(4 + \frac{32}{\varepsilon}\Big) \cdot r_i \Big\} \setminus \bigcup_{j \le i-1} E_j, \qquad (4.1)$$

where $E_0$ is the empty set. (Here the parameters $N_i, r_i$ are as in Definition 16.) Letting $C_\varepsilon$ denote
$(4 + \frac{32}{\varepsilon})$, we note that all edges in $E_i$ have length in $(C_\varepsilon r_{i-1}, C_\varepsilon r_i]$.

While the graph $(V, E)$ with $E = \bigcup_i E_i$ is a $(1 + \varepsilon)$-spanner for the original metric with few edges,
obtaining a bounded-degree spanner requires some modifications to the basic construction. First,
the edges in $E$ are directed (merely for the purposes of the algorithm, and the proof). For each
$v \in V$, define $i^*(v) := \max\{i \mid v \in \varphi(N_i)\}$. For each edge $(u, v) \in E$, we direct the edge from $u$ to $v$
if $i^*(u) < i^*(v)$. If $i^*(u) = i^*(v)$, the edge can be directed arbitrarily. Chan et al. show that each
vertex $x \in V$ has out-degree bounded by $\beta = \varepsilon^{-O(\dim_G)}$. Then, the following steps are performed:

• Consider any vertex $x$, and all the edges that are directed into $x$. These edges come from
various sets $E_i$; let us denote by $F_i = F_i(x)$ the subset of edges directed into $x$ that belong
to $E_i$.

• Suppose the non-empty subsets are $F_{i_1}, F_{i_2}, \ldots, F_{i_t}$, where $i_j < i_{j+1}$. We do nothing to the
first $7 \log \varepsilon^{-1}$ of these edge sets; these contribute $\varepsilon^{-O(\dim_G)}$ to the final degree of $x$.

• Consider a value of $j > 7 \log \varepsilon^{-1}$: from the set $F_{i_{j - 7 \log \varepsilon^{-1}}}$ of edges directed into $x$, we choose
an arbitrary one $\{u, x\}$. We replace edges of the form $\{y, x\} \in F_{i_j}$ by edges $\{y, u\}$, and refer
to these (at most $\varepsilon^{-O(\dim_G)}$) edges as edges donated from $x$ to $u$.

Note that the length of the edge $\{u, x\}$ is at most $C_\varepsilon 2^{i_j - 7 \log \varepsilon^{-1}} = C_\varepsilon \varepsilon^7 2^{i_j}$, whereas the length
of any edge $\{y, x\} \in F_{i_j}$ is at least $C_\varepsilon 2^{i_j - 1}$; hence $d_G(u, x) \le 2 \varepsilon^7 d_G(x, y) \le \varepsilon^6 d_G(x, y)$,
since $\varepsilon < 1/4$. By the triangle inequality, $d_G(u, y) \in (1 \pm \varepsilon^6)\, d_G(x, y)$.

Additionally, note that if $x$ donates a long edge $(x, y) \in F_{i_j}$ to $u$, then $(u, x) \in F_{i_{j - 7 \log \varepsilon^{-1}}}$, so
that $d_G(x, u)$ is at least $C_\varepsilon 2^{i_{j - 7 \log \varepsilon^{-1}} - 1}$.

Theorem 17 ([4]) The spanner thus constructed has degree $\varepsilon^{-O(\dim_G)}$ and stretch $(1 + \varepsilon)$.

From the construction of the bounded-degree spanner, note that each vertex $u \in V$ has the following
edges incident to it:

• Type-A edges. These correspond to the $\varepsilon^{-O(\dim_G)}$ edges that were directed away from $u$.

• Type-B edges. These correspond to the edges directed into $u$ that belong to the smallest
$7 \log \varepsilon^{-1}$ levels; this gives another $\varepsilon^{-O(\dim_G)}$ edges in total.

• Type-C edges. For each edge $e = \{u, x\}$ of type-A incident to $u$, there are at most
$\varepsilon^{-O(\dim_G)}$ other edges incident to $u$ that are not counted above. Each such edge $e' = \{y, u\}$
corresponds to some edge of the form $\{y, x\} \in E_i$ (for some $i$ such that $x, y \in \varphi(N_i)$), such
that the edge was "donated" from $x$ to $u$ to maintain $x$'s degree bound.

4.2 Bounding the Dimension of the Convex Closure 

Simply by the distortion bound, it follows that the doubling dimension of the bounded-degree
spanner $G'$ is close to $\dim_G$. Of course, the bounded degree does not imply that $\mathrm{conv}(G')$ has low
doubling dimension: in this section, we use the Structure Theorem 13 to show this fact, and hence
prove Theorem 1.

Lemma 18 Given the graph $G'$ defined as above, fix any vertex $v$ and radius $R$, and $\varepsilon < 1/4$. Then
the number of long edges $|L_v(R)|$ with respect to $v, R$ is at most $\varepsilon^{-O(\dim_G)}$.
Proof: Recall that $L_v(R)$ is the set of edges that have one endpoint within the ball $B(v, R)$, and
have length at least $R$. Define $\ell \in \mathbb{Z}_{\ge 0}$ such that $R \in (C_\varepsilon 2^{\ell - 1}, C_\varepsilon 2^{\ell}]$.

By the spanner construction, any type-A or type-B edge that is long must belong to $\bigcup_{j \ge \ell} E_j$, and
hence must have both endpoints in $\varphi(N_\ell)$. Moreover, one endpoint of each such long edge must
lie in the ball $B(v, R) \subseteq B(v, C_\varepsilon 2^{\ell})$; since the points in $\varphi(N_\ell)$ are at distance at least $2^{\ell}$ from each
other, there can be at most $(C_\varepsilon)^{O(\dim_G)}$ many such endpoints within the ball. Moreover, each one
of these endpoints has at most $\varepsilon^{-O(\dim_G)}$ type-A or type-B edges; multiplying them together, using
the fact that $C_\varepsilon = O(\varepsilon^{-1})$, and simplifying gives an upper bound of $\varepsilon^{-O(\dim_G)}$ on the number of
type-A and type-B edges in $L_v(R)$.

Let us now consider the edges in $L_v(R)$ that are of type-C with respect to their endpoint within
$B(v, R)$. Recall that each type-C edge $\{u, y\}$ can be associated with some edge $\{x, y\} \in E$ (of almost
the same length, up to a factor of $(1 \pm \varepsilon^6)$) such that $x$ donates the edge to $u$. Let us fix one such
long edge $e = \{u, y\}$ associated with $\{x, y\} \in E_i$; hence the distance $d_G(x, y) \in (C_\varepsilon 2^{i-1}, C_\varepsilon 2^{i}]$, and
also $x, y \in \varphi(N_i)$. By the construction of the type-C edges, the distance $d_G(u, x) \le \varepsilon^6 \cdot d_G(x, y)$,
and hence $x$ lies in the ball $B(v, R + \varepsilon^6 C_\varepsilon 2^{i})$.

Given any fixed level $i \ge \ell - 1$, the number of donor vertices is bounded by the number of points
in $B(v, C_\varepsilon (2^{\ell} + \varepsilon^6 2^{i}))$ that are at least $2^{i}$ distance apart from each other, which can be loosely
bounded by $\varepsilon^{-O(\dim_G)}$. Each such donor vertex could donate $\varepsilon^{-O(\dim_G)}$ edges, which would give
us a total of $\varepsilon^{-O(\dim_G)}$ edges for the level $i$. Summing this over all levels would give us too many
edges, so we use this bound only for levels $i$ such that $\ell - 1 \le i \le \ell + O(\log \varepsilon^{-1})$.

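Consider any level $i \ge \ell + 6 \log \varepsilon^{-1}$: any donor vertex for such a level must lie in the ball
$B(v, C_\varepsilon (2^{\ell} + \varepsilon^6 2^{i})) \subseteq B(v, C_\varepsilon \varepsilon^6 2^{i+1}) \subseteq B(v, C_\varepsilon \varepsilon^5 2^{i})$. A little algebra shows that

$$\varepsilon^5 C_\varepsilon = \varepsilon^5 \Big(4 + \frac{32}{\varepsilon}\Big) \le 36\, \varepsilon^4 \le \varepsilon,$$

using $\varepsilon < 1/4$, and thus the donor vertex must be at distance at most $\varepsilon 2^{i}$ from $v$. However, since the donor vertex
must belong to $\varphi(N_i)$, it must be at distance at least $2^{i}$ from any other donor vertex. Now, if
there were two donor vertices at distance at most $\varepsilon 2^{i}$ from $v$, they would be at distance at most $2 \varepsilon 2^{i} < 2^{i}$ from each
other; this implies that there can be at most one donor vertex for such a "high" level.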

Finally, it remains to show that the total number of long edges donated by this donor vertex $x$ to
vertices in $B(v, R)$ is small. Let $i_1, i_2, \ldots, i_t$, with $i_j < i_{j+1}$, be the levels for which $x$ donates a long
edge to vertices in $B(v, R)$; we shall show that $t$ is at most $O(\log \varepsilon^{-1})$. Since the first edge is long,
$R \le C_\varepsilon 2^{i_1 + 1}$. Moreover, since $x$ donates this edge to $u_1$, we conclude that $d_G(x, u_1) \le \varepsilon^6 C_\varepsilon 2^{i_1}$, so
that $d_G(v, x) \le R + \varepsilon^6 C_\varepsilon 2^{i_1} \le (2 + \varepsilon^6)\, C_\varepsilon 2^{i_1}$. Suppose that $t > 7 \log \varepsilon^{-1} + 3$. Then an edge in $F_{i_t}$ is
donated from $x$ to $u_t$, and we have that $d_G(x, u_t) \ge C_\varepsilon 2^{i_4 - 1}$. On the other hand, since $u_t \in B(v, R)$,
by the triangle inequality, $d_G(x, u_t) \le d_G(x, v) + d_G(v, u_t) \le (3 + \varepsilon^6)\, C_\varepsilon 2^{i_1}$. Since $i_4 \ge i_1 + 3$, this gives
us the desired contradiction. Thus $t \le O(\log \varepsilon^{-1})$. Since there are at most $\varepsilon^{-O(\dim_G)}$ edges donated
to $B(v, R)$ from each of these levels, the claim follows. ■

Using Lemma 18 along with the Structure Theorem 13 implies that the dimension of $\mathrm{conv}(G')$ is
bounded by $O(\dim_G \log \varepsilon^{-1})$, which proves Theorem 1.
5 Convex Completions for Trees



The construction of the previous section showed that given any graph G, we could construct a new 
graph G' such that distances in G and G' are within (1 + e) of each other, and conv(G') has low 
doubling dimension. However, since the construction starts with the shortest-path metric $d_G$ and
completely ignores the topological structure of $G$ itself, it is not suited to proving Theorem 2, which
seeks to start with a tree and end with another tree. In this section, we show a different approach 
that allows us to monitor the graph structure more closely. 

5.1 The Construction for Trees 

We give a procedure that takes a general graph $G$ and outputs a graph $G'$ (the construction
itself does not depend on $G$ being a tree); we then show some properties that hold when $G$ is a tree.
The procedure takes a graph $G = (V, E)$, and constructs a new graph $G' = (V', E')$ with $V \subseteq V'$
(by way of an intermediate graph $\hat{G}$) as follows. Define an exponential tail with $k$ edges as a path
$P = (v_0, v_1, v_2, \ldots, v_k)$, where the length of the edge $\{v_{i-1}, v_i\}$ is $2^i$. Without loss of generality, the
smallest edge length in $G$ is at least $2^\tau$, where $\tau = 6 + \lceil \log(1/\varepsilon) \rceil$.

We construct the graph G' in the following way: 

• As in Section 4.1, we consider a net-tree $(T, \varphi)$ for the graph $G$. If $N_i$ is the set of nodes in
$T$ at height $i$, then for $u \in V$ define $i^*(u)$ to be the largest $i$ such that $u \in \varphi(N_i)$. Attach to
each $u \in V$ an exponential tail with $i^*(u)$ edges; refer to the $j$-th vertex on this path as $u_{[j]}$,
with $u_{[0]} = u$. Let $\hat{G}$ be this intermediate graph consisting of $G$ along with the tails.

• Consider an edge $e = \{u, v\} \in E(G)$, and suppose its length lies in the interval $(C_\varepsilon 2^{i-1}, C_\varepsilon 2^{i}]$.
Some leaf of $T$ must be mapped by $\varphi$ to $u \in V$: let the level-$i$ ancestor of that node be
mapped by $\varphi$ to $\bar{u}$; similarly, let $\bar{v}$ be defined for $v$. We now make an edge $\{\bar{u}_{[i]}, \bar{v}_{[i]}\}$ of
length $\ell_e$ in the graph $G'$.

Note that if we start off with a tree $T$, the above procedure adds exponential tails to $T$ to get the
intermediate graph $\hat{T}$, and then "moves the edges up the tails" to get the final graph $T'$.
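The distance from a vertex to the j-th vertex of its exponential tail is a geometric sum, which is where the quantity 2^{j+1} - 2 in the proofs below comes from. A minimal check, with hypothetical helper names:

```python
def exponential_tail(k):
    """Edge lengths of an exponential tail with k edges: 2**1, ..., 2**k."""
    return [2 ** i for i in range(1, k + 1)]


def tail_prefix_length(j):
    """Distance from the root v_0 of a tail to its j-th vertex v_j."""
    return sum(exponential_tail(j))


prefix_lengths = [tail_prefix_length(j) for j in range(6)]
```

Since each prefix is less than twice the length of its last edge, climbing a tail to level j costs only a constant multiple of 2^j, which is small compared with the edges of length about $C_\varepsilon 2^{j}$ that are attached at that level.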

Proposition 19 (Distance Preservation) Let $\varepsilon < 1/4$. If the input graph is a tree $T = (V, E)$,
then the above procedure results in a connected tree $T' = (V', E')$ such that for any $x, y \in V$,

$$(1 + \varepsilon)^{-1}\, d_T(x, y) \le d_{T'}(x, y) \le (1 + \varepsilon)\, d_T(x, y).$$

Proof: Let us consider performing the above-mentioned transformation for edges in increasing
order of edge-length. Given $j \in \mathbb{Z}_{\ge 0}$, let $T_j$ be the forest formed by deleting all edges of length
more than $C_\varepsilon 2^{j}$ from $T$; also, let $T'_j$ be the forest formed by deleting the corresponding edges in $T'$.
We will prove by induction on $j$ that for all $x, y$ that lie in some connected component in $T_j$, their
distance in $T'_j$ will satisfy the desired stretch bound. The base case is trivial, since all components
of $T_0$ have single nodes in them.

To prove the claim for $j$, we inductively assume it for $j - 1$. Now consider taking some edge
$e = \{u, v\}$ of length $\ell_e \in (C_\varepsilon 2^{j-1}, C_\varepsilon 2^{j}]$. In this case we find some nodes $\bar{u}$ and $\bar{v}$, and add an edge
of length $\ell_e$ between $\bar{u}_{[j]}$ and $\bar{v}_{[j]}$. By the properties of the net-tree, the distance $d_T(u, \bar{u}) \le 2^{j+1} - 2$.
Since $T_{j-1}$ already contains all edges of length at most $C_\varepsilon 2^{j-1}$, and $C_\varepsilon > 4$, the net point $\bar{u}$ lies in the
same component as $u$ in $T_{j-1}$. By the induction hypothesis, $d_{T'_{j-1}}(u, \bar{u}) \le (1 + \varepsilon)\, 2^{j+1}$; note that this
implicitly proves that $u$ and $\bar{u}$ are in the same component in $T'_{j-1}$. A similar claim holds for $d_T(v, \bar{v})$.
Hence the distance in $T'_j$ between $u$ and $v$ is at most

$$d_{T'}(u, \bar{u}) + d_{T'}(\bar{u}, \bar{u}_{[j]}) + \ell_e + d_{T'}(\bar{v}_{[j]}, \bar{v}) + d_{T'}(\bar{v}, v)$$
$$\le 2 \times (1 + \varepsilon)\, 2^{j+1} + 2 \times 2^{j+1} + \ell_e$$
$$\le \ell_e \Big( \frac{8(1 + \varepsilon) + 8}{C_\varepsilon} + 1 \Big) \le (1 + \varepsilon)\, \ell_e,$$

where we used the facts that $\ell_e > C_\varepsilon 2^{j-1}$, $C_\varepsilon = (4 + \frac{32}{\varepsilon})$ and $\varepsilon < 1/4$. Since none of the edges of $T$ is
stretched by more than $(1 + \varepsilon)$, this implies that the stretch for all pairs is bounded by the same
value.

We also need to show that the distances are not shrunk too much in $T'$; to show this, we go via
$\hat{T}$. (Recall that $\hat{T}$ was the original tree $T$ along with the exponential tails.) First note that for
any $u, v \in V$, $d_T(u, v) = d_{\hat{T}}(u, v)$. We show that distances do not shrink by much in going from $\hat{T}$ to $T'$. It
suffices to show this for the edges of $T'$. For an edge $e' = (\bar{u}_{[j]}, \bar{v}_{[j]})$ that has length $\ell_e > C_\varepsilon 2^{j-1}$,
we note that their distance in $\hat{T}$ satisfies

$$d_{\hat{T}}(\bar{u}_{[j]}, \bar{v}_{[j]}) \le d_{\hat{T}}(\bar{u}_{[j]}, \bar{u}) + d_{\hat{T}}(\bar{u}, u) + \ell_e + d_{\hat{T}}(v, \bar{v}) + d_{\hat{T}}(\bar{v}, \bar{v}_{[j]}) \le 4 (2^{j+1} - 2) + \ell_e. \qquad (5.2)$$

Since $C_\varepsilon > 32/\varepsilon$, this is at most $(1 + \varepsilon)\, \ell_e$. Thus the contraction going from $\hat{T}$ to $T'$ is at most
$(1 + \varepsilon)$.

Finally, we note that we have shown that $T'$ is connected, and the number of edges in $T'$ is equal
to the number of edges in $\hat{T}$, which is a tree. Thus $T'$ is a tree as well. ■

5.2 Bounding the Dimension of the Convex Closure: The Tree Case 

Finally, to show that the doubling dimension of $\mathrm{conv}(T')$ is small, we will again invoke Theorem 13.
However, since we have added additional vertices in going from $T$ to $T'$, we first show that $\dim(T')$
is $O(\dim(T))$. Since we have already shown that distances are preserved in going from $\hat{T}$ (the tree $T$
together with its exponential tails) to $T'$, it suffices to bound the doubling dimension of $\hat{T}$.

Lemma 20 The doubling dimension of $\hat{T}$ is at most $O(\dim(T))$.

Proof: Let $u_{[i]} \in V(\hat{T})$ and $R > 0$ with $R \in (2^{j-1}, 2^{j}]$. We wish to show that $B(u_{[i]}, 2R)$ can
be covered by a small number of balls of radius $R$. From the definition of doubling dimension, it
follows that there is a set $Y$ with $|Y| \le 2^{2 \dim(T)}$ such that $B_T(u, 2R) \subseteq \bigcup_{y \in Y} B_T(y, R/2)$. Note that
for any $v \notin \varphi(N_{j-2})$, the tail attached to $v$ has length at most $R/2$. Let $Z = B(u, 2R) \cap \varphi(N_{j-2})$;
clearly $|Z| \le 2^{O(\dim(T))}$. Finally, let $Z' = \{v_{[j-1]} : v \in Z\}$ and $Z'' = \{v_{[j]} : v \in Z\}$. It is easy to
verify that $B(u_{[i]}, 2R) \subseteq \bigcup_{y \in Y \cup Z \cup Z' \cup Z''} B(y, R)$. The claim follows. ■

Finally, we show the following bound on the number of long edges in T' . 

Lemma 21 (Few Long Edges) For any vertex $v \in T'$ and every radius $R$, the number of long
edges in $T'$ with respect to $v, R$ is bounded by $2^{O(\dim)} \log \varepsilon^{-1}$.
Proof: First consider some $v \in V$, and $R > 0$, and define $\ell \in \mathbb{Z}_{\ge 0}$ such that $R \in (C_\varepsilon 2^{\ell-1}, C_\varepsilon 2^{\ell}]$.
Every long edge incident on $B(v, R)$ must have length at least $R$. Further, edges longer than $2 C_\varepsilon R$
are incident on a tail node further than $R$ from its root, and hence such an edge cannot be incident
on $B(v, R)$. For each of the length scales $(C_\varepsilon 2^{\ell+j-1}, C_\varepsilon 2^{\ell+j}]$, $0 \le j \le \log C_\varepsilon$, we will bound the
number of long edges in that length scale. Fix one such scale, and let $L(v, R, j) = \{(u_i, w_i) : 1 \le i \le
|L(v, R, j)|\}$ be the set of long edges of length in $(C_\varepsilon 2^{\ell+j-1}, C_\varepsilon 2^{\ell+j}]$, such that $d(v, u_i) \le R$. Since
each long edge has length more than $R$, there is a path from $v$ to $u_i$ that does not use any of the
long edges. Consider the set of nodes $W = \{w_i : 1 \le i \le |L(v, R, j)|\}$. Clearly, for any $w, w' \in W$,
$d(w, w')$ is at most $2R + 2 C_\varepsilon 2^{\ell+j} \le 4 C_\varepsilon 2^{\ell+j}$. Moreover, since $T'$ is a tree, the symmetric difference
of the $v$-$w$ and $v$-$w'$ paths gives the shortest path from $w'$ to $w$. Since the long edges incident on
$w$ and $w'$ are in this symmetric difference, we conclude that $d(w, w') > 2 C_\varepsilon 2^{\ell+j-1}$. Thus from the
bound on doubling dimension, we conclude that $|W| \le 2^{O(\dim)}$. Adding the contribution of the
$O(\log \varepsilon^{-1})$ distance scales, we get the desired bound.

We now extend the argument to a vertex $v_{[i]}$ on an exponential tail hanging off $v$. If $i > j$, then 
$B(v_{[i]}, R) = \{v_{[i]}\}$. All edges incident on $v_{[i]}$ have, up to a factor of two, the same length, and thus 
their endpoints form a near-uniform submetric. Thus we can bound the degree of $v_{[i]}$ by $2^{O(\dim)}$ 
and the claim follows. On the other hand, when $i \le j$, $B(v_{[i]}, R) \subseteq B(v, 2R)$ and an argument 
analogous to the one for the case $v \in V$ above suffices. ■ 

Theorem 2 follows. 

6 Lower Bounds 

In this section, we show that the tradeoff between distortion and dimension blowup is asymptotically 
optimal. Consider the graph $K_{1,n}$ with $v_0$ as the center node and $\{v_1, \ldots, v_n\}$ as the set of leaves. 
Set the length of the edge $\{v_0, v_i\}$ to $2^i$ and let $d$ be the resulting metric on the vertices $V$ of 
$K_{1,n}$. It is easy to check that this metric has constant doubling dimension. We next show that the 
doubling dimension of any geodesic metric $(X, d')$ containing a $(1 + \varepsilon)$-distortion copy of $(V, d)$ is 
$\Omega(\log \log \varepsilon^{-1})$. 
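
The constant-doubling claim for the weighted star can be sanity-checked numerically. The following Python sketch is an illustration only and not part of the argument: it assumes the covering form of the doubling condition (every ball of radius $2R$ is covered by few balls of radius $R$), samples only radii $R$ that occur as distances in the metric, and uses a greedy $R$-net as the cover. The names `star_metric` and `max_net_size` are hypothetical helpers.

```python
def star_metric(n):
    """Distance matrix of K_{1,n}: index 0 is the center v_0, and the
    edge {v_0, v_i} has length 2**i, so d(v_i, v_j) = 2**i + 2**j."""
    w = [0] + [2 ** i for i in range(1, n + 1)]
    return [[0 if a == b else w[a] + w[b] for b in range(n + 1)]
            for a in range(n + 1)]


def max_net_size(n):
    """Largest greedy R-net found inside any ball B(x, 2R).

    Balls of radius R centered at the net points cover B(x, 2R), so the
    returned value upper-bounds the number of radius-R balls needed,
    for the sampled radii only (an assumption of this sketch).
    """
    d = star_metric(n)
    radii = sorted({d[a][b] for a in range(n + 1)
                    for b in range(n + 1) if a != b})
    worst = 0
    for x in range(n + 1):
        for R in radii:
            ball = [p for p in range(n + 1) if d[x][p] <= 2 * R]
            net = []
            for p in ball:  # scan in index order, so v_0 is tried first
                if all(d[p][q] > R for q in net):
                    net.append(p)
            worst = max(worst, len(net))
    return worst


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, max_net_size(n))
```

If the reported value stays the same as $n$ grows, that is consistent with a doubling constant independent of the number of leaves, at least for the radii tested.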

Lemma 22 Let $(X, d')$ be any geodesic metric such that $V \subseteq X$ and $d(v_i, v_j) \le d'(v_i, v_j) \le (1 + 
\varepsilon)\, d(v_i, v_j)$ for all $v_i, v_j \in V$. Then $\dim(X, d')$ is $\Omega(\log \log \varepsilon^{-1})$. 

Proof: Denote by $uw[x]$ the point on the shortest $u$-$w$ path in $X$ that is at distance $x$ from $u$ (if 
there is more than one shortest path, pick one arbitrarily). We shall argue that the points $v_0v_i[1]$ 
for $i \in \{1, \ldots, \log(2\varepsilon)^{-1}\}$ form a large near-uniform submetric in $X$. Indeed $d'(v_0v_i[1], v_0v_j[1]) \le 
d'(v_0v_i[1], v_0) + d'(v_0, v_0v_j[1]) = 2$. On the other hand, by the triangle inequality, 

$$\begin{aligned}
d'(v_0v_i[1], v_0v_j[1]) &\ge d'(v_i, v_j) - d'(v_0v_i[1], v_i) - d'(v_0v_j[1], v_j) \\
&= d'(v_i, v_j) - (d'(v_0, v_i) - 1) - (d'(v_0, v_j) - 1) \\
&\ge 2 + d(v_i, v_j) - (1 + \varepsilon)\,(d(v_0, v_i) + d(v_0, v_j)) \\
&= 2 - \varepsilon\,(2^i + 2^j),
\end{aligned}$$

where we have used the bound on the distortion and the distance definitions in $d$ in the last two 
steps. Since $i, j \le \log(2\varepsilon)^{-1}$, we conclude that $d'(v_0v_i[1], v_0v_j[1]) \ge 1$. Thus we have $\log(2\varepsilon)^{-1}$ 
points in $X$ that lie within $B(v_0, 2)$, no two of which can be covered by a single ball of radius $\frac{1}{2}$. 
Thus the doubling dimension of $X$ is $\Omega(\log \log \varepsilon^{-1})$. ■ 
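
The arithmetic behind the separation bound can be checked with exact rationals. The sketch below is illustrative only: it assumes $\varepsilon = 2^{-k}$ so that $\log_2(1/(2\varepsilon)) = k - 1$ is an integer, and evaluates the lower bound $2 - \varepsilon(2^i + 2^j)$ over all pairs of distinct indices in that range. The function name `separated_points` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import combinations
from math import log2


def separated_points(k):
    """For eps = 2**-k, return (m, smallest lower bound), where
    m = log2(1/(2*eps)) = k - 1 is the number of points v_0 v_i[1] and the
    bound is the minimum over i < j <= m of 2 - eps*(2**i + 2**j)."""
    eps = Fraction(1, 2 ** k)
    m = k - 1
    if m < 2:
        raise ValueError("need k >= 3 so that at least two points exist")
    bounds = [2 - eps * (2 ** i + 2 ** j)
              for i, j in combinations(range(1, m + 1), 2)]
    return m, min(bounds)


if __name__ == "__main__":
    for k in (4, 8, 16):
        m, worst = separated_points(k)
        # m points at pairwise distance >= 1, all within distance 1 of v_0
        print(f"eps=2^-{k}: m={m}, min bound={worst}, log2(m)={log2(m):.2f}")
```

The number of separated points grows like $\log \varepsilon^{-1}$, so the logarithm of that count, which is what a packing argument yields for the dimension, grows like $\log \log \varepsilon^{-1}$.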

Theorem 4 follows. 

For general metrics, we show a stronger lower bound, under a stronger constraint on $X$. Let 
$V = \{0, 1\}^p$ with $d(x, y) = 2^{p - \mathrm{lcp}(x, y)}$, where $\mathrm{lcp}(x, y)$ denotes the length of the longest common 
prefix of strings $x$ and $y$. Once again, one can easily check that $(V, d)$ has constant doubling 
dimension. We show that any graph $H = (V, E)$ on $V$ approximating $d$ within distortion $(1 + \varepsilon)$ 
must satisfy $\dim(\mathrm{conv}(H)) \in \Omega(\log \varepsilon^{-1})$. 
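
To make the prefix metric concrete, the following Python sketch builds it for a small $p$ and checks two facts: that it satisfies the ultrametric inequality, and that every pair of strings differing in the first bit is at distance exactly $2^p$. This is an illustration under stated assumptions, not part of the proof: the convention $d(x, x) = 0$ is added here (the formula alone would give 1), and `lcp`, `prefix_metric` and `is_ultrametric` are hypothetical helper names.

```python
from itertools import product


def lcp(x, y):
    """Length of the longest common prefix of strings x and y."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return n


def prefix_metric(p):
    """Points of {0,1}^p and d(x, y) = 2**(p - lcp(x, y)) for x != y.
    Setting d(x, x) = 0 is an assumption of this sketch."""
    pts = ["".join(bits) for bits in product("01", repeat=p)]
    d = {(x, y): 0 if x == y else 2 ** (p - lcp(x, y))
         for x in pts for y in pts}
    return pts, d


def is_ultrametric(pts, d):
    """Check d(x, z) <= max(d(x, y), d(y, z)) for all triples."""
    return all(d[x, z] <= max(d[x, y], d[y, z])
               for x in pts for y in pts for z in pts)


if __name__ == "__main__":
    p = 4
    pts, d = prefix_metric(p)
    # Distances between strings that differ in the first bit.
    cross = {d[x, y] for x in pts for y in pts if x[0] != y[0]}
    print(is_ultrametric(pts, d), cross)
```

The second check matters for Lemma 23, whose proof relies on all pairs between the two halves of $V$ being at the same distance $2^p$.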

Lemma 23 Let $H = (V, E)$ be any graph such that the shortest path metric $d'$ satisfies $d(x, y) \le 
d'(x, y) \le (1 + \varepsilon)\, d(x, y)$ for all $x, y \in V$. Then $\dim(\mathrm{conv}(H))$ is $\Omega(\log \varepsilon^{-1})$. 

Proof: For $p = \log(2\varepsilon)^{-1}$, we first show that $H$ must have all edges connecting $V_0 = \{0x : x \in 
\{0, 1\}^{p-1}\}$ and $V_1 = \{1x : x \in \{0, 1\}^{p-1}\}$. Indeed, suppose that edge $(0x, 1y) \notin H$. Then the 
shortest path in $H$ between $0x$ and $1y$ must be of length at least $2^p + 1$. This however violates the 
distortion constraint, since $(1 + \varepsilon) 2^p = 2^p + \frac{1}{2}$. Now consider the set of points $A = \{e[2^{p-1}] : e = (0x, 1y),\ x, y \in \{0, 1\}^{p-1}\}$. 
Clearly for any $a, b \in A$, $d(a, b) \le 3 \cdot 2^{p-1}$ and $d(a, b) \ge 2 \cdot 2^{p-1}$. The claimed bound on the doubling 
dimension follows. ■ 

Theorem 5 follows. 
Acknowledgments 

We thank James Lee for pointing out that a weaker version of Theorem 1 could be inferred from 
Semmes' results. We also thank Robi Krauthgamer and Ravishankar Krishnaswamy for discussions. 



References 

[1] P. Assouad. Plongements lipschitziens dans $\mathbb{R}^n$. Bull. Soc. Math. France, 111(4):429-448, 
1983. 

[2] A. Beygelzimer, S. Kakade, and J. Langford. Cover trees for nearest neighbor. In The 23rd 
International Conference on Machine Learning (ICML), 2006. 

[3] G. Calinescu, H. Karloff, and Y. Rabani. Approximation algorithms for the 0-extension problem. 
In Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms, pages 
8-16. ACM Press, 2001. 

[4] H. T.-H. Chan, A. Gupta, B. M. Maggs, and S. Zhou. On hierarchical routing in doubling 
metrics. In Proceedings of the 16th ACM-SIAM Symposium on Discrete Algorithms (SODA), 
pages 762-771, 2005. 

[5] K. L. Clarkson. Nearest neighbor queries in metric spaces. Discrete Comput. Geom., 
22(1):63-93, 1999. 

[6] R. Cole and L.-A. Gottlieb. Searching dynamic point sets in spaces with bounded doubling 
dimension. In The thirty-eighth annual ACM symposium on Theory of computing (STOC), 
2006. 






[7] J. Fakcharoenphol, C. Harrelson, S. Rao, and K. Talwar. An improved approximation 
algorithm for the 0-extension problem. In Proceedings of the fourteenth annual ACM-SIAM 
symposium on Discrete algorithms, pages 257-265. Society for Industrial and Applied 
Mathematics, 2003. 

[8] A. Gupta, R. Krauthgamer, and J. R. Lee. Bounded geometries, fractals, and low-distortion 
embeddings. In Proceedings of the 44th Symposium on the Foundations of Computer Science 
(FOCS), pages 534-543, 2003. 

[9] A. Gupta, I. Newman, Y. Rabinovich, and A. Sinclair. Cuts, trees and $\ell_1$-embeddings of 
graphs. Combinatorica, 24(2):233-269, 2004. (Preliminary version in 40th FOCS, 1999.) 

[10] S. Har-Peled and M. Mendel. Fast constructions of nets in low dimensional metrics, and their 
applications. In Proceedings of the twenty-first annual symposium on Computational geometry, 
pages 150-158, 2005. 

[11] P. Indyk and A. Naor. Nearest neighbor preserving embeddings. In ACM Transactions on 
Algorithms (To appear). 

[12] W. B. Johnson, J. Lindenstrauss, and G. Schechtman. Extensions of Lipschitz maps into Banach 
spaces. Israel J. Math., 54(2):129-138, 1986. 

[13] A. Karzanov. Minimum 0-extensions of graph metrics. European Journal of Combinatorics, 
19(1):71-101, 1998. 

[14] P. Klein, S. A. Plotkin, and S. B. Rao. Excluded minors, network decomposition, and 
multicommodity flow. In Proceedings of the 25th ACM Symposium on the Theory of Computing 
(STOC), pages 682-690, 1993. 

[15] G. Konjevod, A. W. Richa, and D. Xia. Optimal-stretch name-independent compact routing 
in doubling metrics. In The twenty-fifth annual ACM symposium on Principles of distributed 
computing, 2006. 

[16] G. Konjevod, A. W. Richa, and D. Xia. Optimal scale-free compact routing schemes in doubling 
networks. In Proceedings of the 18th ACM-SIAM Symposium on Discrete Algorithms (SODA), 
2007. 

[17] R. Krauthgamer and J. R. Lee. The intrinsic dimensionality of graphs. In Proceedings of 
the thirty-fifth annual ACM symposium on Theory of computing, pages 438-447. ACM Press, 
2003. 

[18] R. Krauthgamer and J. R. Lee. Navigating nets: simple algorithms for proximity search. 
In Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms, pages 
798-807. Society for Industrial and Applied Mathematics, 2004. 

[19] J. Lee and A. Naor. Absolute Lipschitz extendability. Comptes Rendus de l'Académie des 
Sciences - Series I - Mathematics, 338(11):859-862, 2004. 

[20] J. Lee, A. Naor, and Y. Peres. Trees and Markov convexity. Geometric and Functional 
Analysis, to appear. Preliminary version in SODA 2006. 






[21] J. Matoušek. On embedding trees into uniformly convex Banach spaces. Israel Journal of 
Mathematics, 114:221-237, 1999. (Czech version in: Lipschitz distance of metric spaces, C.Sc. 
degree thesis, Charles University, 1990). 

[22] J. Matoušek. Extension of Lipschitz mappings on metric trees. Commentationes Mathematicae 
Universitatis Carolinae, 31(1):99-104, 1990. 

[23] S. B. Rao. Small distortion and volume preserving embeddings for planar and Euclidean 
metrics. In 15th Annual ACM Symposium on Computational Geometry, pages 300-306, 1999. 

[24] S. Semmes. On the nonexistence of bi-Lipschitz parameterizations and geometric problems 
about $A_\infty$-weights. Rev. Mat. Iberoamericana, 12(2):337-410, 1996. 

[25] K. Talwar. Bypassing the embedding: Algorithms for low-dimensional metrics. In Proceedings 
of the 36th ACM Symposium on the Theory of Computing (STOC), pages 281-290, 2004. 






